Integrated Hardware and Software Fault Tolerance for Real Time Applications
Abstract
A distributed fault-tolerant system for real-time process control, based on an enhancement of the distributed recovery block (DRB), is described. Coverage is provided for failures in hardware, system software, networks, and application software. Fault tolerance provisions are introduced at the system level and in application software using a DRB-based architecture. This implementation allows the use of standard off-the-shelf hardware and software components, providing life-cycle cost and extensibility benefits. Maintainability is enhanced through an automated restart capability and a logging function.
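As a rough illustration of the recovery-block idea that the DRB builds on, the sketch below pairs a primary and a shadow node, each running its own try block against a shared acceptance test. It is a minimal single-process sketch, not the system described in the abstract: the names `Node`, `drb_execute`, `fast_sqrt`, `newton_sqrt`, and `accept` are hypothetical, the square-root task is an invented example, and the two nodes are called in sequence, whereas a real DRB would run them concurrently on separate machines.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Node:
    """One node: a try block plus an acceptance test (hypothetical names)."""
    name: str
    try_block: Callable[[float], float]
    acceptance_test: Callable[[float, float], bool]

    def run(self, x: float) -> Optional[float]:
        # A crash (exception) and a rejected result both count as a failure.
        try:
            result = self.try_block(x)
        except Exception:
            return None
        return result if self.acceptance_test(x, result) else None


def drb_execute(primary: Node, shadow: Node, x: float) -> Tuple[str, float]:
    """Run both nodes on the same input and prefer the primary's result."""
    # Sequential calls stand in for concurrent execution on two machines.
    primary_result = primary.run(x)
    shadow_result = shadow.run(x)
    if primary_result is not None:
        return primary.name, primary_result
    if shadow_result is not None:
        return shadow.name, shadow_result  # the shadow takes over
    raise RuntimeError("both try blocks failed")


def fast_sqrt(x: float) -> float:
    # Primary algorithm with a simulated fault for large inputs.
    if x > 100:
        raise OverflowError("simulated fault")
    return math.sqrt(x)


def newton_sqrt(x: float) -> float:
    # Independent alternate algorithm (Newton iteration).
    guess = x if x > 1 else 1.0
    for _ in range(60):
        guess = 0.5 * (guess + x / guess)
    return guess


def accept(x: float, r: float) -> bool:
    # Acceptance test: the result must be a non-negative square root of x.
    return r >= 0 and abs(r * r - x) <= 1e-6 * max(1.0, x)


primary = Node("primary", fast_sqrt, accept)
shadow = Node("shadow", newton_sqrt, accept)

print(drb_execute(primary, shadow, 9.0))    # primary result is accepted
print(drb_execute(primary, shadow, 144.0))  # primary fails, shadow takes over
```

Using a different algorithm in each try block means a design fault in one is less likely to be repeated in the other, while the acceptance test decides which result may be released.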
Similar resources
Model-based development of fault-tolerant real-time systems
The design of fault-tolerant real-time systems is a complex task. The system must not only satisfy real-time requirements, but it must also deliver the specified functionality in the presence of both hardware and software faults. To achieve fault tolerance, the system has to use redundancy. This redundancy is usually achieved by replicating hardware units and executing the application within a ...
Software Diversity and Fault-Tolerance: An Overview
The design of reliable and fault-free software is a major concern for safety-critical real-time and distributed applications. The fault-tolerance community addresses these problems through redundancy in hardware components and through diversity, using different software components. Diversity has been used for many years now as a computer defence mechanism to achieve an acceptable degree of fault-t...
A framework backbone for software fault tolerance in embedded parallel applications
The DIR net (detection-isolation-recovery net) is the main module of a software framework for the development of embedded supercomputing applications. This framework provides a set of functional elements, collected in a library, to improve the dependability attributes of the applications (especially the availability). The DIR net enables these functional elements to cooperate and enhances th...
Fault Tolerance for Multiprocessor Systems Via Time Redundant Task Scheduling
Fault tolerance is often considered a good additional feature for multiprocessor systems, but it is now becoming an essential attribute. Fault tolerance can be achieved with dedicated customized hardware, which may have the disadvantage of high cost. Another approach to fault tolerance is to exploit existing redundancy in multiprocessor systems via a task scheduling software stra...
EFTOS: A Software Framework for More Dependable Embedded HPC Applications
Within the ESPRIT project EFTOS (Embedded Fault-Tolerant Supercomputing), a framework is developed to integrate fault tolerance flexibly and easily into distributed embedded HPC applications. This framework consists of a variety of reusable fault tolerance modules acting at different levels. The cost and performance overhead of generic Operating System and Hardware level fault tolerance mechan...